Abstract. Image feature classification is a challenging problem in many
computer vision applications, specifically in the fields of remote sensing,
image analysis and pattern recognition. In this paper, a novel Self-Organizing
Map, termed improved SOM (iSOM), is proposed with the aim of effectively
classifying mammographic images based on their texture feature
representation. The main contribution of the iSOM is to introduce a new node
structure for the map representation and to adopt a learning technique based
on the Kohonen SOM accordingly. The main idea is to control, in an
unsupervised fashion, the weight updating procedure depending on the class
reliability of the node at weight update time. Experiments were carried out
on real mammographic images. The results showed high accuracy compared to
the classical SOM and other state-of-the-art classifiers.
1 Introduction

Image feature classification presents a challenge in many computer vision
applications. It plays a significant role in the fields of remote sensing, image
analysis and pattern recognition. Recently, content-based image classification
and retrieval have received increasing attention through numerous applications [3]
in the fields of education, entertainment, the military and biomedicine. With the
enormous growth of computational power, image retrieval and classification have
become more demanding tasks in computer vision. However, success in solving
such problems depends on object-based image understanding, proper
representation of image contents and suitable learning algorithms.

The Self-Organizing Map (SOM) [4] (also called the Kohonen network) is an
artificial unsupervised network characterized by the fact that its neighbouring
neurons develop adaptively into specific detectors of different vector patterns.
The neurons become specifically tuned to various classes of patterns through
competitive, unsupervised and self-organizing learning. The spatial location of
a neuron in the network (given by its coordinates) corresponds to a particular
input vector pattern. Similar input vectors correspond to the same neuron or
to neighbouring neurons. One important characteristic of the SOM is that it can
perform feature extraction and classification simultaneously [5].
In the medical field, SOM has been used extensively and effectively. In [7],
classification methods based on multilayer perceptrons and Kohonen
self-organizing map classifiers are applied to image data to identify Alzheimer’s
disease. Starting from the idea of considering the SOM as a cell characterizing
one specific class only, Victor presents in [5] a new neural classification model
called Concurrent Self-Organizing Maps (CSOM), representing a winner-takes-all
collection of small SOM networks. Each SOM of the system is trained individually
to provide the best results for one class only. The CSOM model proved to perform
better than SOM, both in recognition rate and in reduced training time. In [3],
Tommy proposes a new image classification approach through a tree-structured
feature set. In this approach, the image content is organized in a two-level
tree, where the root node at the top level represents the whole image and the
child nodes at the bottom level represent the homogeneous regions of the image.
The tree-structured representation combines both the global and the local
features through the root and the child nodes. The tree-structured feature data
are then processed by a two-level self-organizing map (SOM), which consists of
an unsupervised SOM for processing image regions and a supervised concurrent
SOM (CSOM) classifier for the overall classification of images. Dar-Ren et
al. [9] apply self-organizing maps (SOMs) to classify benign and malignant
sonographic breast lesions.

The classical SOM and most of its variations rely on a fully unsupervised
learning procedure. This is because the node structure of the map does not
provide any possibility of utilizing the class label while training the SOM,
resulting in unstable behaviour when classifying patterns, especially in real
data in the presence of noise and artifacts. Moreover, for classification
purposes, the mapping can result in divided clusters because it requires that
nearby points behave similarly. Motivated by the issues above, a simple but
effective improvement of the classical SOM is proposed in this paper, which
takes advantage of the probability of a particular node being a winner through
a voting criterion.

The paper is organized as follows. Section 2 provides a brief discussion of
the advantages of using SOM for classification over traditional classification
models. In Section 3, an effective texture representation method is briefly
discussed. Section 4 discusses the proposed model and its implementation.
Then, an experimental study and conclusions with some possible future
directions are provided in Sections 5 and 6, respectively.

2 SOM-based Classifier

The SOM neural network is one of the most popular unsupervised neural network
models, which simultaneously performs a topology-preserving projection from
the data space onto a regular two-dimensional grid [4]. There are several reasons
to use a SOM as a classifier: (i) weights representing the solution are found by
iterative training, (ii) SOM has a simple structure for physical implementation
and interpretation, (iii) SOM can easily map large and complex distributions,
and (iv) the generalization property of the SOM produces appropriate results for
input vectors that are not present in the training set.

A basic SOM network is composed of an input layer, an output layer, and a
network connection layer. The input layer contains one neuron for each element
of the input vector. The output layer consists of neurons that are located on
a regular, usually two-dimensional grid and are fully connected with those at
the input layer. The network connection layer is formed by vectors composed of
the weights between the input and output layers.

The neurons in the map are connected to adjacent ones by a neighbourhood
relation dictating the topological structure of the neurons. Each neuron $i$ in
the map is represented by an $n$-dimensional weight or reference vector
$w_i = [w_{i1}, \ldots, w_{in}]^T$, where $n$ is equal to the number of neurons
in the input layer.

When an input vector $x \in \mathbb{R}^n$ is presented to the network, the neurons
in the map compete with each other to be the winner (or the best-matching
unit, BMU) $b$, which is the closest to the input vector in terms of some
dissimilarity measure such as the Euclidean distance, as follows:

$$\|x - w_b\| = \min_i \{\|x - w_i\|\} \tag{1}$$
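
As an illustration of Eq. (1), the following minimal Python sketch searches a small map for the BMU. The dictionary layout (grid position mapped to weight vector) and the name `find_bmu` are assumptions made for this example only and are not part of the original method description.

```python
import math

def find_bmu(weights, x):
    """Return the grid position (row, col) of the best-matching unit for x.

    weights: dict mapping (row, col) -> list of n floats (hypothetical layout).
    x: list of n floats.
    """
    # Eq. (1): the BMU minimises the Euclidean distance to the input vector.
    return min(weights, key=lambda pos: math.dist(weights[pos], x))

# A 2 x 2 map with two-dimensional weight vectors (toy values).
weights = {
    (0, 0): [0.0, 0.0],
    (0, 1): [1.0, 0.0],
    (1, 0): [0.0, 1.0],
    (1, 1): [1.0, 1.0],
}
bmu = find_bmu(weights, [0.9, 0.2])
```

Any other dissimilarity measure could be substituted for `math.dist` without changing the structure of the search.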

During the training session, the weights of neurons that are topologically
arranged in the map within a certain geometric distance are moved toward the
input $x$ using the self-organization learning rule:

$$w_i(t+1) = w_i(t) + \eta\, h_{bi}(t)\,[x(t) - w_i(t)] \tag{2}$$

where $t = 0, 1, 2, 3, \ldots$ is the discrete time index, $\eta$ is a small
positive learning rate and $h_{bi}(t)$ is the neighbourhood kernel around the
BMU $b$ at time $t$. In general, the kernel around a neuron $c$ (here $c = b$)
can be defined as

$$h_{ci}(t) = h(\|r_c - r_i\|, t) \tag{3}$$

where $r_c, r_i \in \mathbb{R}^2$ are the location vectors of neurons $c$ and $i$,
respectively, and when $\|r_c - r_i\|$ increases, $h_{ci}$ decreases to zero
gradually. This leads to local relaxation or smoothing effects on the weight
vectors of neurons in the neighbourhood of the BMU. Therefore, similar input
vectors are grouped into a single neuron or neighbouring ones in the map when
learning is accomplished.
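
The update rule of Eq. (2) with a kernel of the form of Eq. (3) can be sketched as follows. The text does not fix a particular kernel, so the Gaussian form, the width parameter `sigma` and the function names are assumptions chosen for illustration.

```python
import math

def gaussian_kernel(r_c, r_i, sigma):
    """Assumed form of h(||r_c - r_i||, t); sigma would normally shrink with t."""
    d2 = (r_c[0] - r_i[0]) ** 2 + (r_c[1] - r_i[1]) ** 2
    return math.exp(-d2 / (2.0 * sigma ** 2))

def som_update(weights, x, bmu, eta, sigma):
    """Apply Eq. (2) in place to every node of the map."""
    for pos, w in weights.items():
        h = gaussian_kernel(bmu, pos, sigma)
        # Each weight moves toward x by a fraction eta * h of the gap.
        weights[pos] = [wk + eta * h * (xk - wk) for wk, xk in zip(w, x)]
    return weights

weights = {(0, 0): [0.0, 0.0], (0, 1): [1.0, 1.0]}
som_update(weights, x=[1.0, 0.0], bmu=(0, 0), eta=0.5, sigma=1.0)
```

In this toy call the BMU moves halfway toward the input, while its neighbour moves by a smaller fraction because the kernel value falls off with grid distance.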

SOM has been used as a classifier by projecting data vectors belonging to a
higher-dimensional input space of dimension $n$ onto $m$ code-book vectors of
size $n$ organized in a two-dimensional lattice structure. SOM addresses two
fundamental issues: the first is the clustering of data and the second is the
relationship between the clusters. The clustering is unsupervised learning,
while the relationship between clusters can be seen on the planar surface by
checking the distances between the code-book vectors. Although it is difficult
to deduce the exact relationship between them, since the code-book vector size
is much greater than the dimension of the planar surface (2), this gives an
insight into the classification regions. What is proposed here differs from
previous works: a novel training algorithm for SOM is introduced, together
with a new node structure that supports it. Another point is that the
algorithm is applied to a set of already selected features.
3 Mammographic Image Representation

In this section, a short discussion of an image texture feature extraction method
is provided for effective mammographic image representation.

The Grey Level Co-occurrence Matrix (GLCM) provides a full texture
description of an image in a statistical fashion. Simply put, the GLCM technique
first computes the probability of co-occurrence between two grey levels $i$ and
$j$, given a relative orientation and distance, for all possible co-occurring
grey level pairs in an image window. Then, a set of selected statistics is
applied to the entire matrix to calculate the texture features. In this work,
the four most commonly and practically used features (dissimilarity,
uniformity, entropy and contrast) [11,12,13] have been selected for the model
evaluation. The following four statistics will be used exclusively in this work:

1. Dissimilarity $= \sum_{i,j=1}^{G} C_{ij}\,|i - j|$

2. Uniformity $= \sum_{i,j=1}^{G} C_{ij}^{2}$

3. Entropy $= -\sum_{i,j=1}^{G} C_{ij} \log C_{ij}$

4. Contrast $= \sum_{i,j=1}^{G} C_{ij}\,(i - j)^{2}$

where $C_{ij}$ represents the co-occurring probabilities stored inside the GLCM
and $G$ represents the number of grey levels available.
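
The four statistics can be computed directly from a normalised co-occurrence matrix. The sketch below is illustrative only: it assumes a single non-symmetric pixel offset `(dx, dy)`, the natural logarithm for the entropy, and the convention that zero-probability cells contribute nothing to the entropy; the function names are hypothetical.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Normalised co-occurrence matrix C for pixel pairs offset by (dx, dy).

    image: 2-D list of integer grey levels in range(levels); it is assumed
    to contain at least one valid pixel pair for the chosen offset.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def texture_features(C):
    """Dissimilarity, uniformity, entropy and contrast of a normalised GLCM."""
    G = len(C)
    cells = [(i, j, C[i][j]) for i in range(G) for j in range(G)]
    return {
        "dissimilarity": sum(p * abs(i - j) for i, j, p in cells),
        "uniformity": sum(p * p for _, _, p in cells),
        # Cells with p = 0 are skipped (0 * log 0 is taken as 0).
        "entropy": -sum(p * math.log(p) for _, _, p in cells if p > 0),
        "contrast": sum(p * (i - j) ** 2 for i, j, p in cells),
    }

image = [[0, 0, 1],
         [1, 1, 0]]
features = texture_features(glcm(image, levels=2))
```

The indices run from 0 here rather than from 1 as in the formulas; this does not affect the results because only the differences $i - j$ enter the statistics.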

For a more accurate feature extraction and a further investigation of the
localization property of the represented features, the process of extracting
textural information from a mammographic image depends on first identifying the
object of interest. Consequently, the images must be segmented as a
pre-processing step before the feature extraction process. To this end, a
block-wise partitioning method is used in this work, which can be described as
follows (see Figure 1):

1. If the image contains inhomogeneous regions, then a set of $SN$ seeds is
automatically selected and their associated regions are grown by region
growing. Otherwise, divide the entire image into $SN$ non-overlapping
sub-images $SI = \{I_1, I_2, \ldots, I_{SN}\}$.

2. Split each of these $SN$ sub-images into $M$ blocks
$I_j = \{B_1, B_2, \ldots, B_M\}$, $j = 1, 2, \ldots, SN$.

3. For each block $B_i$, $i = 1, 2, \ldots, M$, construct a block-representing
set of texture feature vectors.

4. Use the $k$-means algorithm to cluster the feature vectors into several
classes for each sub-image $I_j$ independently.

5. For each cluster in $I_i$, $i = 1, 2, \ldots, SN$, construct a
sub-image-representing set of texture feature vectors
$F_k = \{f_1, f_2, \ldots, f_X\}$, $k = 1, 2, \ldots, L$, where $L$ is the number
of classes, each of which contains $X$ texture features.

6. Build the final set of texture features representing the overall image in
the form of a single transaction of the final dataset (set of images),
$\mathcal{D} = \{t_1, t_2, \ldots, t_c\}$, where $c$ is the number of images and
$t_i$ is a vector of size $(SN \times L \times X)$, $i = 1, 2, \ldots, c$.

7. For each $t_i$ in $\mathcal{D} = \{t_1, t_2, \ldots, t_c\}$, add the class
label of its image.
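
The purely geometric part of this procedure (the second branch of step 1 together with step 2) can be sketched as below. The text does not say how the $SN$ sub-images or the $M$ blocks are laid out, so the rectangular grids, their default shapes and the function names are assumptions for illustration; seed selection, region growing and the clustering steps are not covered.

```python
def split_grid(image, n_rows, n_cols):
    """Split a 2-D list into n_rows * n_cols non-overlapping tiles (row-major)."""
    h, w = len(image), len(image[0])
    tiles = []
    for a in range(n_rows):
        for b in range(n_cols):
            r0, r1 = a * h // n_rows, (a + 1) * h // n_rows
            c0, c1 = b * w // n_cols, (b + 1) * w // n_cols
            tiles.append([row[c0:c1] for row in image[r0:r1]])
    return tiles

def partition(image, sub_grid=(2, 3), block_grid=(2, 3)):
    """Return SN sub-images, each given as a list of M blocks.

    With the assumed default grids, SN = 2 * 3 = 6 and M = 2 * 3 = 6.
    """
    return [split_grid(sub, *block_grid) for sub in split_grid(image, *sub_grid)]

# Toy 8 x 12 "image" whose pixel values are all distinct.
image = [[r * 12 + c for c in range(12)] for r in range(8)]
blocks = partition(image)
```

Each block would then be passed to a texture descriptor (step 3) before the per-sub-image clustering of step 4.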
Fig. 1. The architecture of the block-wise feature extraction method with SN = 6, M = 6
and L = 3 for simplification.


4 iSOM Classifier 


As mentioned in previous sections, SOM is designed to be an unsupervised learning
technique. Here we enhance the SOM network so that it can be used for supervised
learning (classification) by introducing a new node structure and an enhanced
learning procedure that utilizes the class label in the weight update step.

First, every node is represented by the set of connection weights
$w = \{w_0, w_1, \ldots, w_n\}$, where $n$ is the number of attributes, and a set
of winning class counters (WCC) $c = \{c_1, c_2, \ldots, c_m\}$, where $m$ is the
number of classes. This representation makes it possible to utilize the class
labels provided in the training set while training the SOM. Put simply, the
vector WCC is introduced into the node structure to provide a voting criterion,
so that nodes whose maximum counter is $WCC_i$ are pulled during the weight
update process. Shifting such nodes towards the BMU, which is of the same class,
strengthens the relationship between them. At the same time, leaving nodes of
other classes untouched weakens the relationship between those nodes and their
dissimilar neighbours.

In every iteration, after computing the distance between the input vector and
the SOM elements using Eq. (1), the winning node is activated. This step is the
same as in the classical Kohonen SOM. Even though several distance functions can
be used, the main idea is to measure the similarity between objects independently
of the data.

The next step after identifying the BMU is to increase $WCC_i$ by one for
the $i$-th class accordingly. This increment gives more confidence that this node
is targeted by examples of class $i$; this confidence indicates a similarity
between the input example and the winning node.

At the final stage, what we call a constrained weight update is performed. The
problem with the traditional SOM is that all the neighbours are blindly attracted,
or pulled, towards the winning node. The term constrained means selecting only
the nodes that fit a clear criterion: nodes that are mostly targeted by examples
from the same class are supposed to belong to the same cluster, whereas nodes
that are mostly targeted by examples from other classes should not come closer
to this cluster, so they are left untouched.
Put simply, when a winning node is activated, and before performing the weight
update, a vote is conducted: only the nodes in the neighbourhood whose maximum
class counter corresponds to the class label of the current instance are
considered neighbour nodes. For an instance $E_j$ belonging to class $C_i$,
$X = \mathrm{Max}\|WCC\|$, where Max is a function that returns the set of nodes
with maximum $WCC_i$. Finally, the weight update given in Eq. (2) is performed
over $X$, and at the same time $WCC_i$ is increased by one.
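
The vote can be illustrated with a short Python sketch. The text does not say how ties or nodes that have never won should be handled; here such nodes are treated as eligible, which is an assumption of this example, as are the data layout and the name `select_neighbours`.

```python
def select_neighbours(wcc, neighbourhood, label):
    """Return the neighbourhood nodes whose dominant class equals `label`.

    wcc: dict mapping node position -> list of per-class win counters.
    neighbourhood: iterable of node positions around the BMU.
    label: class index of the current training instance.
    """
    selected = []
    for pos in neighbourhood:
        counters = wcc[pos]
        # A node qualifies when the counter of the instance class is its
        # maximum; ties and all-zero counters qualify too (assumption).
        if counters[label] == max(counters):
            selected.append(pos)
    return selected

wcc = {(0, 0): [3, 1], (0, 1): [0, 4], (1, 0): [0, 0]}
chosen = select_neighbours(wcc, [(0, 0), (0, 1), (1, 0)], label=0)
```

In this toy map, the node at (0, 1) is dominated by class 1 and is therefore ignored when an instance of class 0 is presented.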

Figure 2 illustrates the learning process of SOM based on the proposed node
structure. Every node is represented by the set of connection weights and a
set of winning class counters, such that nodes with maximum WCC (black nodes)
are pulled during the weight update process. Shifting such nodes towards the
BMU, which is of the same class, strengthens the relationship between them. At
the same time, leaving nodes (grey nodes) of other classes untouched weakens
the relationship between those nodes and their dissimilar neighbours.



Fig. 2. The architecture of iSOM: the proposed node structure and weight updating
process. The legend distinguishes selected nodes from ignored nodes.


The rest of the algorithm follows the classical SOM explained in Section 2,
including the weight decrease given in Eq. (3) for relaxation.

Selecting the nodes that are supposed to be related to the winning node,
updating their weights towards the winning node, and leaving those nodes that
most probably belong to a different class could enhance training in terms of
both time and the quality of the generated model.


Implementation 

1. Randomly set the initial values for all the connection weights. 

2. For every node in the grid set the class counters to zero. 

3. If the training set is not exhausted, select an instance from the training
set and go to the next step; otherwise go to step 7.

4. Compute the distance between the selected instance and every node in the
network and select the winning node accordingly.

5. For every node in the neighbourhood:

- Select the nodes whose maximum class counter is the same as that of the
winning node.

- Update the weights of the nodes selected in the previous step.

- Increase the corresponding class counter of the node.

6. Go to step 3. 

7. If the number of epochs > LIMIT then exit, else go to step 2.
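
The steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a reference implementation: the vote uses the class label of the current instance (as in the description of the constrained update), the BMU itself is always updated, ties and unused nodes pass the vote, the learning rate and a Gaussian kernel of width `sigma` are kept fixed, and the decision rule in `predict` (dominant class of the BMU) is assumed because the text does not specify one. All function and parameter names are hypothetical.

```python
import math
import random

def train_isom(data, labels, n_classes, rows, cols,
               epochs=5, eta=0.5, sigma=1.0, radius=1.5, seed=0):
    """Sketch of the iSOM training loop (steps 1 to 7)."""
    rng = random.Random(seed)
    nodes = [(r, c) for r in range(rows) for c in range(cols)]
    # Step 1: random initial connection weights.
    weights = {p: [rng.random() for _ in data[0]] for p in nodes}
    wcc = {p: [0] * n_classes for p in nodes}
    for _ in range(epochs):                      # Step 7: epoch limit.
        # Step 2: reset the winning class counters of every node.
        wcc = {p: [0] * n_classes for p in nodes}
        for x, y in zip(data, labels):           # Steps 3 and 6.
            # Step 4: the winning node minimises the Euclidean distance.
            bmu = min(nodes, key=lambda p: math.dist(weights[p], x))
            for p in nodes:                      # Step 5.
                grid_d = math.dist(p, bmu)
                if grid_d > radius:
                    continue  # outside the neighbourhood of the BMU
                # Vote: keep the BMU and nodes whose largest counter belongs
                # to class y (ties and unused nodes pass; assumption).
                if p != bmu and wcc[p][y] != max(wcc[p]):
                    continue
                h = math.exp(-grid_d ** 2 / (2.0 * sigma ** 2))
                weights[p] = [w + eta * h * (xi - w)
                              for w, xi in zip(weights[p], x)]
                wcc[p][y] += 1
    return weights, wcc

def predict(weights, wcc, x):
    """Label x with the dominant class of its BMU (assumed decision rule)."""
    bmu = min(weights, key=lambda p: math.dist(weights[p], x))
    return wcc[bmu].index(max(wcc[bmu]))

# Toy data: two classes in the unit square.
data = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
labels = [0, 0, 1, 1]
weights, wcc = train_isom(data, labels, n_classes=2, rows=3, cols=3)
```

Because only the voted nodes are updated, the inner loop performs fewer weight updates than a classical SOM with the same neighbourhood, which is the source of the training-time saving discussed in the text.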


Table 1. The classification rate (%) in terms of Precision, Recall and F-score by the
iSOM and SOM models.

SOM map size   iSOM model                       SOM model
               Precision  Recall   F-score      Precision  Recall   F-score
10 x 10        94.73      76.92    84.9         80.39      79.12    79.74
15 x 15        94.33      88.76    91.46        90.9       79.59    84.86
20 x 20        97.87      85.26    91.13        98.18      93.1     95.57
25 x 25        100        90.84    95.2         92.1       75.96    83.25



5 Experimental Results 


To demonstrate quantitatively the accuracy of the iSOM model in classification,
we have used the Precision, Recall and F-score metrics under a 10-fold
cross-validation process. They are defined as follows:


$$\mathrm{Precision} = \frac{TP}{TP + FP}, \tag{4}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN}, \tag{5}$$

$$F\text{-score} = \frac{2PR}{P + R}, \tag{6}$$

where $P$ and $R$ denote Precision and Recall, and TP, FP and FN represent,
respectively, the numbers of true positive (abnormal), false positive, and false
negative (normal) foreground pixels. We have also compared the accuracy with
the SOM model and other classifiers. In this experiment, 142 images from the
mini-MIAS database of mammograms are selected randomly for model testing and
evaluation. The sample is distributed between classes as follows: normal class
(n=60) and abnormal class (n=82).
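
For reference, Eqs. (4) to (6) translate into a few lines of Python. The function names and the example counts are hypothetical; the final check reuses the 10 x 10 iSOM precision and recall reported in Table 1.

```python
def f_score(precision, recall):
    """Eq. (6): harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def precision_recall_f(tp, fp, fn):
    """Eqs. (4) to (6) from raw counts (non-zero denominators assumed)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall, f_score(precision, recall)

# Hypothetical counts, for illustration only.
p, r, f = precision_recall_f(tp=8, fp=2, fn=2)

# Consistency check against Table 1 (10 x 10 map, iSOM model).
table_f = f_score(94.73, 76.92)
```

Applied to the 10 x 10 iSOM entry, the formula reproduces the tabulated F-score of 84.9 to within rounding.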

Table 1 compares the classical SOM and the enhanced SOM classifier in
classifying images based on the feature set extracted using the textural
extraction method built on our block-wise ROI selection method described in
Section 3, with SN = 6, M = 8 and L = 3.

Comparing the classical SOM and iSOM, the iSOM achieves a higher F-score for
three of the four map sizes (10 x 10, 15 x 15 and 25 x 25), while the classical
SOM scores higher on the 20 x 20 map. The time consumed in training the two
networks also differs, and again the enhanced SOM wins, as the number of update
operations performed by the enhanced SOM is much smaller than for the classical
SOM due to the proposed constrained weight update. One more point that can be
concluded from the experiment is that the enhanced SOM can reach its best
accuracy in less time, as mentioned, as well as with less memory consumed when
the map size is considered.

The proposed iSOM not only shows high accuracy when compared to the classical
SOM but also when compared with other well-known classifiers in the field, such
as Bayesian classifiers, radial basis function networks and others, run using
the Weka Experimenter; see Table 2.


Table 2. The classification accuracy of well-known classifiers.

Classifier        Precision   Recall    F-score
RBF               58.76%      93.33%    68.31%
Simple Logistic   76%         76.08%    76.05%
Bagging           87.50%      75.49%    78.87%
J48               93.10%      92.85%    92.95%
NaiveBayes        58.97%      78.12%    67.60%


6 Conclusion and Future Work 

In this paper, four texture features derived from the co-occurrence matrix were
used. For this purpose, a textural extraction method based on accurate ROI
selection has been utilized to obtain an efficient image representation.

The main characteristic of the SOM-based classifier is the conservation of
the topology: after learning, close observations are associated with the same
class or with close classes according to the definition of the neighbourhood in
the SOM network. This feature allows the resulting classification to be
considered a good starting point for further developments. The paper presents a
novel classifier inspired by the classical SOM, termed improved SOM (iSOM), by
introducing a new node structure and adapting the underlying self-organizing
learning procedure such that, for the same number of neurons, iSOM has better
recognition performance than SOM. Experimental results confirm the good
performance of the iSOM when compared to other state-of-the-art classifiers.

As future work, there are several research directions: 1) consider more, and
more effective, image-based features (e.g., global information and other kinds
of local information), more samples for the evaluation and more prototype-based
algorithms for the comparison; 2) develop iSOM to work in parallel by applying
more than two iSOMs concurrently, in order to cope with multi-class
classification problems; 3) besides the use of more than two iSOMs, extend the
proposed model such that the underlying neurons are incrementally added or
removed and trained, to overcome the limitation of manually adapting the
topology of the network.